
    From Bare Metal to Virtual: Lessons Learned when a Supercomputing Institute Deploys its First Cloud

    As the primary provider of research computing services at the University of Minnesota, the Minnesota Supercomputing Institute (MSI) has long been responsible for serving the needs of a user base numbering in the thousands. In recent years, MSI, like many other HPC centers, has observed a growing need for self-service, on-demand, data-intensive research, as well as the emergence of many new controlled-access datasets for research purposes. In light of this, MSI constructed a new on-premise cloud service, named Stratus, which is architected from the ground up to easily satisfy data-use agreements and to fill four gaps left by traditional HPC. The resulting OpenStack cloud, constructed from HPC-specific compute nodes and backed by Ceph storage, is designed to fully comply with controls set forth by the NIH Genomic Data Sharing Policy. Herein, we present twelve lessons learned during the ambitious sprint to take Stratus from inception into production in less than 18 months. Important, and often overlooked, components of this timeline included the development of new leadership roles, staff and user training, and user support documentation. Along the way, the lessons learned extended well beyond the technical challenges often associated with acquiring, configuring, and maintaining large-scale systems. Comment: 8 pages, 5 figures, PEARC '18: Practice and Experience in Advanced Research Computing, July 22-26, 2018, Pittsburgh, PA, US.
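    As a rough illustration of the self-service, on-demand provisioning an OpenStack cloud such as Stratus enables (this is not MSI's actual configuration; the cloud name, image, flavor, and network below are hypothetical placeholders), a minimal openstacksdk sketch might look like this:

```python
# Illustrative only: self-service VM provisioning on an OpenStack cloud.
# The cloud entry "stratus" and the image/flavor/network names are invented;
# credentials would come from a clouds.yaml entry on the client machine.
import openstack

conn = openstack.connect(cloud="stratus")

image = conn.compute.find_image("ubuntu-22.04")      # hypothetical image name
flavor = conn.compute.find_flavor("hpc.large")       # hypothetical flavor name
network = conn.network.find_network("project-net")   # hypothetical project network

# Launch a server and wait until it is ACTIVE.
server = conn.compute.create_server(
    name="analysis-node",
    image_id=image.id,
    flavor_id=flavor.id,
    networks=[{"uuid": network.id}],
)
server = conn.compute.wait_for_server(server)
print(server.status)
```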

    Recovering Residual Forensic Data from Smartphone Interactions with Cloud Storage Providers

    There is a growing demand for cloud storage services such as Dropbox, Box, Syncplicity and SugarSync. These public cloud storage services can store gigabytes of corporate and personal data in remote data centres around the world, which can then be synchronized to multiple devices. This creates an environment which is potentially conducive to security incidents, data breaches and other malicious activities. The forensic investigation of public cloud environments presents a number of new challenges for the digital forensics community. However, it is anticipated that end-devices such as smartphones will retain data from these cloud storage services. This research investigates how forensic tools currently available to practitioners can be used to provide a practical solution to the problems of investigating cloud storage environments. The research contribution is threefold. First, the findings from this research support the idea that end-devices which have been used to access cloud storage services can provide a partial view of the evidence stored in the cloud service. Second, the research provides a comparison of the number of files which can be recovered from different versions of cloud storage applications. In doing so, it also supports the idea that amalgamating the files recovered from more than one device can result in the recovery of a more complete dataset. Third, the chapter contributes to the documentation and evidentiary discussion of the artefacts created by specific cloud storage applications and different versions of these applications on iOS and Android smartphones.

    Explainable Clustering Applied to the Definition of Terrestrial Biomes

    We present an explainable clustering approach for use with 3D tensor data and use it to define terrestrial biomes from observations in an automatic, data-driven fashion. Our approach allows us to use a larger number of features than is feasible for current empirical methods for defining biomes, which typically rely on expert knowledge and are inherently more subjective than our approach. The data consists of 2D maps of geophysical observation variables, which are rescaled and stacked to form a 3D tensor. We adapt an image segmentation algorithm to divide the tensor into homogeneous regions before partitioning the data using the k-means algorithm. We add explainability to the classification by approximating the clusters with a compact decision tree whose size is limited. Preliminary results show that, with a few exceptions, each cluster represents a biome which can be defined with a single decision rule.
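    A minimal sketch of the pipeline the abstract describes, using scikit-learn and synthetic data in place of the geophysical observation maps (the image-segmentation step is omitted, and all variable names and parameters are illustrative assumptions, not the authors' code):

```python
# Sketch: stack 2D feature maps into a tensor, cluster grid cells with k-means,
# then approximate the clusters with a size-limited decision tree so each
# cluster can (ideally) be described by a single decision rule.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)

# Synthetic stand-in for rescaled observation maps: (n_features, height, width).
n_features, height, width = 5, 60, 80
tensor = rng.random((n_features, height, width))

# Flatten each grid cell into a feature vector: (height * width, n_features).
X = tensor.reshape(n_features, -1).T

# Partition the cells into candidate "biomes".
n_biomes = 6
labels = KMeans(n_clusters=n_biomes, n_init=10, random_state=0).fit_predict(X)

# Compact, size-limited decision tree approximating the cluster assignment.
tree = DecisionTreeClassifier(max_depth=3, max_leaf_nodes=n_biomes, random_state=0)
tree.fit(X, labels)

print("Tree agreement with k-means labels:", tree.score(X, labels))
print(export_text(tree, feature_names=[f"var_{i}" for i in range(n_features)]))
```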

    Distributed Model-to-Model Transformation with ATL on MapReduce

    Efficient processing of very large models is a key requirement for the adoption of Model-Driven Engineering (MDE) in some industrial contexts. One of the central operations in MDE is rule-based model transformation (MT). It is used to specify manipulation operations over structured data coming in the form of model graphs. However, being based on computationally expensive operations like subgraph isomorphism, MT tools face issues with both memory occupancy and execution time when dealing with increasing model size and complexity. One way to overcome these issues is to exploit the wide availability of distributed clusters in the Cloud for the distributed execution of MT. In this paper, we propose an approach to automatically distribute the execution of model transformations written in a popular MT language, ATL, on top of a well-known distributed programming model, MapReduce. We show how the execution semantics of ATL can be aligned with the MapReduce computation model. We describe the extensions to the ATL transformation engine to enable distribution, and we experimentally demonstrate the scalability of this solution in a reverse-engineering scenario.
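    The following toy sketch (not the authors' ATL engine) illustrates the general alignment of rule-based model transformation with the two MapReduce phases: the map phase applies transformation rules to source elements locally, and the reduce phase resolves references between the produced target fragments. The toy metamodel (Class to Table) and all names are invented for the example:

```python
# Toy illustration of distributing a model transformation over map/reduce:
# mappers apply rules and emit target fragments keyed by source-element id;
# the reduce step resolves cross-references using a source-to-target trace.

# Toy source model: class elements referencing each other by id.
source_model = [
    {"id": "A", "type": "Class", "super": None},
    {"id": "B", "type": "Class", "super": "A"},
]

def map_phase(element):
    """Local rule application: Class -> Table, keeping the 'super' reference
    unresolved (resolution happens globally, as in ATL's resolve step)."""
    if element["type"] == "Class":
        yield element["id"], {"name": element["id"], "extends_ref": element["super"]}

def reduce_phase(fragments):
    """Global resolution: replace source-element references with the target
    elements produced for them, using a trace from source id to target."""
    trace = dict(fragments)
    for table in trace.values():
        ref = table.pop("extends_ref")
        table["extends"] = trace[ref]["name"] if ref in trace else None
    return list(trace.values())

# Drive both phases sequentially; on a cluster the map calls would run in
# parallel over partitions of the source model.
emitted = [pair for element in source_model for pair in map_phase(element)]
print(reduce_phase(emitted))
```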

    Ticket to Talk: Supporting Conversation between Young People and People with Dementia through Digital Media

    We explore the role of digital media in supporting intergenerational interactions between people with dementia and young people. Though meaningful social interaction is integral to quality of life in dementia, initiating conversation with a person with dementia can be challenging, especially for younger people who may lack knowledge of that person's life history. This challenge can be compounded by a limited understanding of the nature of dementia and unfamiliarity with leading and maintaining conversation. We designed a mobile application, Ticket to Talk, to support intergenerational interactions by encouraging young people to collect media relevant to a person with dementia and use it in conversation with them. We evaluated Ticket to Talk through trials with two families, a care home, and groups of older people. We highlight the difficulties of using technologies such as this as a conversational tool, the value of digital media in supporting intergenerational interactions, and the potential to positively shape the agency of people with dementia in social settings.

    Randomness Concerns When Deploying Differential Privacy

    The U.S. Census Bureau is using differential privacy (DP) to protect confidential respondent data collected for the 2020 Decennial Census of Population & Housing. The Census Bureau's DP system is implemented in the Disclosure Avoidance System (DAS) and requires a source of random numbers. We estimate that the 2020 Census will require roughly 90TB of random bytes to protect the person and household tables. Although there are critical differences between cryptography and DP, they have similar requirements for randomness. We review the history of random number generation on deterministic computers, including von Neumann's "middle-square" method, the Mersenne Twister (MT19937) (previously the default NumPy random number generator, which we conclude is unacceptable for use in production privacy-preserving systems), and the Linux /dev/urandom device. We also review hardware random number generator schemes, including the use of so-called "Lava Lamps" and the Intel Secure Key RDRAND instruction. Finally, we present our plan for generating random bits in the Amazon Web Services (AWS) environment using AES-CTR-DRBG seeded by mixing bits from /dev/urandom and the Intel Secure Key RDSEED instruction, a compromise between our desire to rely on a trusted hardware implementation, the unease of our external reviewers in trusting a hardware-only implementation, and the need to generate so many random bits. Comment: 12 pages plus 2 pages of bibliography.
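    A simplified sketch of the generation scheme the abstract describes: seed material is mixed from /dev/urandom and a hardware source, then expanded with AES in CTR mode. This is not the NIST SP 800-90A CTR_DRBG used by the DAS, and RDSEED is not reachable from pure Python, so a placeholder function stands in:

```python
# Simplified illustration only: mix two entropy sources so neither must be
# trusted alone, then expand the seed with an AES-CTR keystream.
# read_rdseed() is a placeholder -- invoking RDSEED requires native code.
import hashlib
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def read_rdseed(n: int) -> bytes:
    """Placeholder for bytes drawn via the Intel RDSEED instruction."""
    return os.urandom(n)  # stand-in only

def mixed_seed() -> bytes:
    """Hash together /dev/urandom output and hardware-sourced bytes."""
    return hashlib.sha256(os.urandom(32) + read_rdseed(32)).digest()

def ctr_keystream(seed: bytes, n_bytes: int) -> bytes:
    """Expand a 32-byte seed into n_bytes using AES-256 in CTR mode."""
    cipher = Cipher(algorithms.AES(seed), modes.CTR(b"\x00" * 16))
    return cipher.encryptor().update(b"\x00" * n_bytes)

random_bytes = ctr_keystream(mixed_seed(), 1 << 20)  # 1 MiB of output
print(len(random_bytes))
```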

    Fault tolerant internet computing: Benchmarking and modelling trade-offs between availability, latency and consistency

    The paper discusses our practical experience and theoretical results from investigating the impact of consistency on latency in distributed fault-tolerant systems built over the Internet and clouds. We introduce a time-probabilistic failure model of distributed systems that employ the service-oriented paradigm for defining cooperation with clients over the Internet and clouds. The trade-offs between consistency, availability and latency are examined, as is the role of the application timeout as the main determinant in the interplay between system availability and responsiveness. The model introduced relies heavily on collecting and analysing a large amount of data representing the probabilistic behaviour of such systems. The paper presents experimental results of measuring the response time in a distributed service-oriented system whose replicas are deployed in different Amazon EC2 location domains. These results clearly show that improvements in system consistency increase system latency, which is in line with the qualitative implication of the well-known CAP theorem. The paper proposes a set of novel mathematical models that are based on statistical analysis of the collected data and enable quantified response-time prediction depending on the timeout setup and on the level of consistency provided by the replicated system.
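    A toy Monte Carlo sketch (not the authors' models) of the trade-off the paper quantifies: waiting for more replica acknowledgements, i.e. stronger consistency, raises response time and increases the chance of missing the application timeout, reducing availability. The latency distribution and all parameters are invented:

```python
# Toy simulation: per-request latency is the q-th fastest of N replica replies;
# larger q (stronger consistency) raises latency and lowers the fraction of
# requests finishing within the application timeout (availability).
import random

N_REPLICAS = 5
TIMEOUT_MS = 250.0
TRIALS = 100_000

def replica_latency_ms() -> float:
    # Heavy-tailed per-replica latency, loosely mimicking WAN behaviour.
    return random.lognormvariate(mu=4.5, sigma=0.5)  # median ~90 ms

def simulate(quorum: int) -> tuple[float, float]:
    """Return (mean response time of successful requests, availability)."""
    total, successes = 0.0, 0
    for _ in range(TRIALS):
        latencies = sorted(replica_latency_ms() for _ in range(N_REPLICAS))
        response = latencies[quorum - 1]  # wait for the q-th fastest reply
        if response <= TIMEOUT_MS:
            total += response
            successes += 1
    return (total / successes if successes else float("nan"),
            successes / TRIALS)

for q in range(1, N_REPLICAS + 1):
    mean_rt, availability = simulate(q)
    print(f"quorum={q}: mean latency {mean_rt:6.1f} ms, availability {availability:.3f}")
```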